Cross-layer transmission realized by light-emitting memristor for constructing ultra-deep neural network with transfer learning ability

Deep neural networks have revolutionized several domains, including autonomous driving, cancer detection, and drug design, and are the foundation of massive artificial intelligence models. However, reports of hardware neural networks still mainly focus on shallow networks (2 to 5 layers). Implementing deep neural networks in hardware is challenging because of their layer-by-layer structure, which leads to long training times, signal interference, and low accuracy caused by gradient explosion/vanishing. Here, we utilize negative ultraviolet photoconductive light-emitting memristors with intrinsic parallelism and hardware-software co-design to achieve optical cross-layer transmission of electrical information. We propose a hybrid ultra-deep photoelectric neural network (UPENN) and an ultra-deep super-resolution reconstruction neural network (USRNN) built from light-emitting memristors and cross-layer blocks, expanding the networks to 54 and 135 layers, respectively. Further, both networks enable transfer learning, approaching or surpassing software-designed networks in multi-dataset recognition and high-resolution restoration tasks. These strategies show great potential for high-precision multifunctional hardware neural networks and edge artificial intelligence.


Construction of UPENN
In the construction of UPENN, we first designed two kinds of cross-layer transmission bottleneck structures, BTNK 1 and BTNK 2, and used the first-layer convolution result of each BTNK 1 as the cross-layer information, simulating the optical and electrical outputs of the device. This information serves three purposes: first, layer-by-layer transmission; second, accumulation of the convolution-processed and layer-by-layer-transmitted results as the final output of the bottleneck structure, which enters the next structure; third, direct transmission to the last BTNK 2 of the stage for the accumulation operation. Finally, we realized a neural network with 53 convolution layers and one fully connected layer, including the corresponding pooling layers, BN layers, and 8 cross-layer transmission layers.
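A minimal PyTorch sketch of this routing is shown below. It is an illustration under assumptions, not the published architecture: the channel widths, kernel sizes, and block internals are placeholders, and summing the stage's cross-layer signals at the last BTNK 2 is one plausible reading of the accumulation described above.

```python
import torch.nn as nn

class BTNK1(nn.Module):
    """Transmitting end: the first convolution result doubles as the
    cross-layer (optical) signal alongside the normal electrical path."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        cross = self.relu(self.bn1(self.conv1(x)))  # cross-layer signal (purpose 1)
        out = self.bn2(self.conv2(cross))
        out = self.relu(out + cross)                # local accumulation (purpose 2)
        return out, cross                           # electrical and optical outputs

class BTNK2(nn.Module):
    """Receiving end: accumulates cross-layer signals sent from earlier BTNK 1s."""
    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)
        self.bn = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x, cross):
        return self.relu(self.bn(self.conv(x)) + cross)  # purpose 3

class Stage(nn.Module):
    """A stage chains BTNK 1s and ends with a BTNK 2 that receives the
    cross-layer signals emitted inside the stage."""
    def __init__(self, channels, n_btnk1=2):
        super().__init__()
        self.btnk1s = nn.ModuleList(BTNK1(channels) for _ in range(n_btnk1))
        self.btnk2 = BTNK2(channels)

    def forward(self, x):
        crosses = []
        for blk in self.btnk1s:
            x, cross = blk(x)
            crosses.append(cross)
        return self.btnk2(x, sum(crosses))
```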
The structure of Contrast Net was similar to that of UPENN, but all cross-layer transmission parts were removed, leaving only the layer-by-layer network framework.

Construction of USRNN
In the construction of USRNN, we first designed the ClBlock from BTNKs. The first-layer convolution result within each ClBlock is designated to carry the cross-layer information, simulating the light output and electrical output of the device.
This information serves three purposes: first, cross-layer transmission; second, addition of the convolution result to the output of the ClBlock as the final output of the BTNK, which then enters the next ClBlock; third, carrying the convolution result across 15 ClBlocks and across all ClBlocks, and accumulating it with the results of the 16th and the last ClBlock, thereby preventing gradient vanishing and explosion.
Finally, varying the number of layers in the upsample module enables USRNN realizations with 133 and 135 layers.
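The long-skip routing can be sketched as follows, under the same caveats as above: the blocks are simplified, the channel width and block count are assumptions, and only the signal paths follow the description in the text.

```python
import torch.nn as nn

class ClBlock(nn.Module):
    """Simplified ClBlock: returns its output plus the first-layer
    convolution result that carries the cross-layer information."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        cross = self.relu(self.conv1(x))      # cross-layer signal
        return self.conv2(cross) + cross, cross

class USRNNBody(nn.Module):
    def __init__(self, channels=64, n_blocks=32):  # block count is an assumption
        super().__init__()
        self.blocks = nn.ModuleList(ClBlock(channels) for _ in range(n_blocks))

    def forward(self, x):
        out, first_cross = self.blocks[0](x)
        for i, block in enumerate(self.blocks[1:], start=1):
            out, _ = block(out)
            if i == 15:                  # carried across 15 ClBlocks to the 16th
                out = out + first_cross
        return out + first_cross         # carried across all ClBlocks to the last
```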

Pre-learning and Re-learning of the Networks
1. The models of the two networks were built in Python, pre-learned on the ImageNet dataset, and the resulting network weights were saved. We then selected eight different datasets from Kaggle to verify the transfer learning ability of the networks. First, the test set of each dataset was used to test the two networks directly, and the corresponding accuracy was obtained. Then, the networks were retrained with the training sets of the 8 datasets and evaluated on the test sets. Because UPENN obtained good network parameters in pre-learning, during re-learning we froze most of the network structures and allowed only the last BTNK structure and the fully connected layer to be adjusted. For Contrast Net, we still trained the whole network, to see whether it could achieve better results.
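A minimal sketch of this partial freezing, assuming the UPENN model exposes its last BTNK and fully connected layer as attributes named last_btnk and fc (hypothetical names, for illustration only):

```python
import torch
import torch.nn as nn

def prepare_for_relearning(model: nn.Module):
    """Freeze all pre-learned weights, then unfreeze only the last BTNK
    and the fully connected layer for re-learning."""
    for p in model.parameters():
        p.requires_grad = False
    for p in model.last_btnk.parameters():  # hypothetical attribute name
        p.requires_grad = True
    for p in model.fc.parameters():         # hypothetical attribute name
        p.requires_grad = True
    return [p for p in model.parameters() if p.requires_grad]

# Usage sketch: optimize only the unfrozen parameters.
# optimizer = torch.optim.Adam(prepare_for_relearning(upenn), lr=1e-4)
```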

2. The training of the model with a 2x resolution restoration structure in USRNN begins with the training set of the DIV2K dataset. Upon completion of this training phase, the weights of all ClBlocks are frozen, and the Upsample module is modified for 4x resolution restoration before training proceeds further.
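This two-stage schedule can be sketched as follows. The PixelShuffle-based head is an assumption (a common super-resolution design), not necessarily the actual Upsample module, and the attribute names are placeholders.

```python
import torch.nn as nn

def make_upsample_head(channels: int, scale: int) -> nn.Sequential:
    """Assumed PixelShuffle head: one 2x stage per factor of two."""
    layers = []
    for _ in range(scale // 2):
        layers += [nn.Conv2d(channels, channels * 4, 3, padding=1),
                   nn.PixelShuffle(2)]
    layers.append(nn.Conv2d(channels, 3, 3, padding=1))  # project back to RGB
    return nn.Sequential(*layers)

# 1) Train the whole network with a 2x head on the DIV2K training set.
#    model.upsample = make_upsample_head(64, scale=2)
# 2) Freeze every ClBlock, swap in a 4x head, and continue training.
#    for p in model.blocks.parameters():
#        p.requires_grad = False
#    model.upsample = make_upsample_head(64, scale=4)
```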

Figure S1. Ultraviolet-visible absorption spectra of materials in the devices.

Figure S2. The emission spectrum of the light-emitting memristor.

Figure S3. The working mechanism of the PVP charge-trapping layer. The polymer poly(4-vinylphenol) (PVP) is a dielectric material whose polar side groups contain an enormous number of deep traps that allow charging and discharging of carriers under an applied voltage. In previous reports, this is the main reason for the hysteresis of OFETs with PVP as the dielectric layer [Appl. Phys. Lett. 89, 262120 (2006); Appl. Phys. Lett. 93, 143302 (2008); Appl. Phys. Lett. 108, 173301 (2016)]. Here, we exploit this feature to achieve the characteristics of artificial synapses by embedding a PVP capture layer in the QLED. Part of the holes are captured by the PVP layer when a bias is applied (I). When the bias is removed, the holes remain stored in the PVP layer (II). When a bias is applied again, the trapped holes are released under the action of the applied electric field (III), increasing the conductivity and brightness of the device and realizing a simulation of synaptic plasticity.

Figure S4. Schematic diagram of recording the EPSB signal with an oscilloscope and a photodetector.

Figure S5. Pearson correlation analysis of postsynaptic current and postsynaptic brightness.

Figure S7. The conductance enhancement and suppression curves obtained by applying continuous electrical pulse stimulation and UV stimulation.

Figure S8. The current variation of the 9 devices. 50 continuous 5 V electrical pulses with an interval of 50 ms were applied, followed by a 3 s exposure under dark and under ultraviolet conditions, after which the electrical pulse stimulation was applied again.

Figure S9. Schematic diagram of the built-in electric field distribution generated by the device under UV, simulated by software; the arrows indicate the electric field direction, and the scale unit is V/m.

Figure S10. Schematic diagram of the barrier change caused by the Fermi level change. In the dark state, the Fermi levels of the device are in equilibrium. Under UV stimulation, a large number of electron-hole pairs are generated in IDTBT; the hole concentration hardly changes, while the electron concentration changes greatly. According to the Fermi-Dirac distribution function, the increase in electron concentration raises the Fermi level of the material (∆E_f1), the Fermi levels between materials are rebalanced, and the potential barrier increases (∆), further hindering hole transport.

Figure S11. Schematic diagram of the organic field-effect transistor with a PIN structure.

Figure S12. The I-V curves of the organic field-effect transistor with and without UV light. The optical powers of 3% and 5% are 1 mW/cm² and 3 mW/cm², respectively.

Figure S13. Schematic diagram of the photo-induced electric field and conductance change of the OTFT.

Figure S14. The current reduction of the transistor under UV irradiation.

Figure S15. The decay time required by the light-emitting memristor after 50-pulse stimulation.

Figure S16. Schematic diagram of resetting the weight of the light-emitting memristor by UV irradiation after 50-pulse stimulation.

Figure S17. Multi-cycle repetition of the device reset operation after 50 electrical pulses.

Figure S18. Lighting time required for resetting with different pulse numbers (at 70 mW/cm²).

Figure S20. Detailed structural diagram of UPENN. Stage 1 to Stage 4 are different structures of ClBlock. Each ClBlock consists of BTNK 1 and BTNK 2, where BTNK 1 is the transmitting end of the optical cross-layer signal and BTNK 2 is the receiving end. BTNK 2 blocks not indicated by a red arrow do not receive optical signals crossing over the stage.

Figure S21. The structural schematic diagram of the BTNK used in UPENN and the positions at which the gradient distribution in Figure 5b was calculated.

Figure S22. Detailed structural diagram of Contrast Net.

Figure S23. The structural schematic diagram of the BTNK used in Contrast Net and the positions at which the gradient distribution in Figure 5c was calculated.

Figure S24. The feature maps of two X-ray images in UPENN and the SSIM of the two stage-4 feature maps. (The feature maps are obtained after the first convolution in each stage of the network.) Figure S24 shows the feature maps obtained from stage 1 to stage 4 in UPENN for X-ray images labeled negative and positive in a binary classification task. In an ideal scenario, the neural network should extract completely different features for negative and positive cases to achieve close to 100% accuracy in separating the images into two categories. It is evident from the figure that the two images acquire different features under the same convolutional kernels (the kernels are identical at each position of the feature map), particularly in stage 3 and stage 4. This indicates that UPENN extracts effective features by avoiding gradient vanishing through cross-layer transmission and by obtaining effective convolutional kernels through pre-learning. Analyzing the similarity of the two stage-4 feature maps with SSIM yields a coefficient of only 0.052, indicating that the two maps are extremely dissimilar. Therefore, UPENN is capable of effectively distinguishing between negative and positive cases.

Figure S25. The feature maps of two X-ray images in Contrast Net and the SSIM of the two stage-4 feature maps. (The feature maps are obtained after the first convolution in each stage of the network.) Figure S25 shows the feature maps obtained from stage 1 to stage 4 in Contrast Net for the X-ray images labeled negative and positive in a binary classification task. In an ideal scenario, the neural network should extract completely different features for negative and positive cases to achieve close to 100% accuracy in separating the images into two categories. It is obvious from the figure that the two images acquire very similar features under the same convolutional kernels. This indicates that the convolutional kernels of Contrast Net cannot effectively distinguish the image features of the two cases, which is essentially due to the ineffective kernels obtained in pre-learning as a result of gradient vanishing. Analyzing the similarity of the two stage-4 feature maps with SSIM yields a coefficient of 0.925, indicating that the two maps are very similar. Therefore, Contrast Net cannot effectively distinguish negative and positive cases.
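Both SSIM values can be reproduced along the following lines with scikit-image; the feature maps themselves would come from forward hooks on each stage's first convolution (not shown), and whether the comparison is made per channel or on a single map is not specified, so the per-channel mean below is an assumption.

```python
import numpy as np
from skimage.metrics import structural_similarity as ssim

def feature_map_ssim(fmap_a: np.ndarray, fmap_b: np.ndarray) -> float:
    """Mean SSIM over the channels of two (C, H, W) feature maps."""
    scores = []
    for a, b in zip(fmap_a, fmap_b):
        rng = float(max(a.max() - a.min(), b.max() - b.min(), 1e-8))
        scores.append(ssim(a, b, data_range=rng))
    return float(np.mean(scores))

# fmap_neg, fmap_pos: stage-4 feature maps of the negative / positive X-ray image
# print(feature_map_ssim(fmap_neg, fmap_pos))
# Reported values: ~0.052 for UPENN (dissimilar), ~0.925 for Contrast Net (similar)
```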

Figure S26. The ClBlock used to construct USRNN and the structure of USRNN.

Figure S27. Partial restoration details of Figure 6b.

Figure S28. Partial restoration details of Figure 6c.

Figure S29. Partial restoration details of Figure 6d.